
Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Neural Information Processing Systems

In contrast to Learning from Demonstration (LfD), which involves both action and state supervision, Learning from Observation (LfO) is more practical in leveraging previously inapplicable resources (e.g., videos), yet more challenging due to the incomplete expert guidance. In this paper, we investigate LfO and its differences from LfD from both theoretical and practical perspectives. We first prove that, when following the modeling approach of GAIL, the gap between LfD and LfO actually lies in the disagreement of the inverse dynamics models between the imitator and the expert. More importantly, the upper bound of this gap is revealed by a negative causal entropy, which can be minimized in a model-free way. Considerable empirical results on challenging benchmarks indicate that our method attains consistent improvements over other LfO counterparts.


Reviews: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Neural Information Processing Systems

I was happy with your inclusion of experiments on manipulation tasks, and agree that they are convincing. I was also happy with your explanation of GAILfo vs. GAIL vs. your algorithm, and your discussion of Sun et al. (2019). Your decision to release code also helps allay any fears I had about reproducibility. I have raised my score to an 8 to reflect these improvements. It is easy to follow the logic and thoughts of the authors.


Reviews: Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Neural Information Processing Systems

Learning from Observation (LfO) is harder, but more practical, than Learning from Demonstration (LfD), which involves both action and state supervision. The paper studies the difference between the two types of learning from both theoretical and practical perspectives, and relates the gap between LfD and LfO to the inverse dynamics disagreement between the imitator and the expert. The paper includes an elaborate and interesting theoretical analysis of this gap, and proposes a method for bridging it through entropy maximization. The empirical evaluation is also thorough, comprising a toy problem for studying the effect of inverse dynamics discrepancy, MuJoCo benchmarks, and an ablation study. The reviewers agree that this is a good, technically sound paper.



Imitation Learning from Observations by Minimizing Inverse Dynamics Disagreement

Yang, Chao, Ma, Xiaojian, Huang, Wenbing, Sun, Fuchun, Liu, Huaping, Huang, Junzhou, Gan, Chuang

Neural Information Processing Systems
